A Study of Differentially Private Frequent Itemset Mining

نویسنده

  • Trupti Kenekar
چکیده

Frequent sets play an important role in many Data Mining tasks that try to search interesting patterns from databases, such as association rules, sequences, correlations, episodes, classifiers and clusters. FrequentItemsets Mining (FIM) is the most well-known techniques to extract knowledge from dataset. In this paper differential privacy aims to get means to increase the accuracy of queries from statistical databases while minimizing the chances of identifying its records and itemset. We studied algorithm consists of a preprocessing phase as well as a mining phase. We under seek the applicability of FIM techniques on the MapReduce platform, transaction splitting. We analyzed how differentially private frequent itemset mining of existing system as well.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On differentially private frequent itemset mining

We consider differentially private frequent itemset mining. We begin by exploring the theoretical difficulty of simultaneously providing good utility and good privacy in this task. While our analysis proves that in general this is very difficult, it leaves a glimmer of hope in that our proof of difficulty relies on the existence of long transactions (that is, transactions containing many items)...

متن کامل

Privacy Preserving Private Frequent Itemset Mining via Smart Splitting

Recently there has been a growing interest in designing differentially private data mining algorithms. A variety of algorithms have been proposed for mining frequent itemsets. Frequent itemset mining (FIM) is one of the most fundamental problems in data mining. It has practical importance in a wide range of application areas such as decision support, web usage mining, bioinformatics, etc. In th...

متن کامل

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is a new emerging field in data mining which has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds minimum threshold. The basic HUIM problem does not consider length of itemsets in its utility measurement and utility values tend to become higher for itemsets containing more items...

متن کامل

CS 730R: Topics in Data and Information Management

1. Summary. In this paper the authors propose a differentially privacy preserving algorithm for mining frequent itemset. This work differs from the other privacy preserving miners present in literature, indeed this algorithm mines the itemset by enforcing cardinality constraints on the transactions present in the dataset. In particular the authors study how the reduction the cardinality of the ...

متن کامل

A Survey on Moving Towards Frequent Pattern Growth for Infrequent Weighted Itemset Mining

Data Mining and knowledge discovery is one of the important areas. In this paper we are presenting a survey on various methods for frequent pattern mining. From the past decade, frequent pattern mining plays a very important role but it does not consider the weight factor or value of the items. The very first and basic technique to find the correlation of data is Association Rule Mining. In ARM...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015